
Risk minimization


Variance-based Regularization with Convex Objectives

Neural Information Processing Systems

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
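
To make the formulation concrete, here is a minimal sketch of the chi-square distributionally robust objective that serves as the convex surrogate for the variance penalty: given per-example losses, it computes the supremum of the reweighted average loss over sample reweightings p lying in a chi-square ball of radius rho around the uniform distribution, which is roughly the empirical mean plus sqrt(2*rho*Var/n). This is an illustrative implementation under assumed conventions (the ball scaling, the names robust_objective and project_simplex, and NumPy as the numerical backend), not the authors' released code.

import numpy as np

def project_simplex(v):
    # Euclidean projection of a vector v onto the probability simplex.
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1.0))[0][-1]
    tau = (css[idx] - 1.0) / (idx + 1.0)
    return np.maximum(v - tau, 0.0)

def robust_objective(losses, rho, tol=1e-8):
    # Convex surrogate for mean(losses) + sqrt(2*rho*Var(losses)/n):
    #   sup_{p in simplex, (n/2)*||p - 1/n||^2 <= rho}  sum_i p_i * losses[i]
    # Assumes rho > 0. Solved by bisection on the multiplier of the
    # chi-square ball constraint; the inner maximizer for a fixed
    # multiplier is a Euclidean projection onto the simplex.
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    uniform = np.full(n, 1.0 / n)

    def ball(p):  # (n/2) * ||p - 1/n||^2, the chi-square-divergence ball
        return 0.5 * n * np.sum((p - uniform) ** 2)

    def weights(lam):  # maximizer of <losses, p> - lam * ball(p) over the simplex
        return project_simplex(uniform + losses / (lam * n))

    worst = np.zeros(n)
    worst[np.argmax(losses)] = 1.0
    if ball(worst) <= rho:          # unconstrained maximizer is already feasible
        return float(losses.max())

    lo, hi = 1e-12, 1.0
    while ball(weights(hi)) > rho:  # grow hi until the constraint is satisfied
        hi *= 2.0
    while hi - lo > tol * hi:       # bisection: the constraint is tight at the optimum
        mid = 0.5 * (lo + hi)
        if ball(weights(mid)) > rho:
            lo = mid
        else:
            hi = mid
    p = weights(hi)
    return float(p @ losses)

# Example: the robust objective exceeds the plain empirical mean,
# penalizing a high-variance loss profile.
losses = np.array([0.1, 0.2, 0.15, 0.9, 0.05])
print(robust_objective(losses, rho=1.0), losses.mean())

In a training loop, this value (or its subgradient with respect to the model parameters, obtained by holding the optimal weights p fixed) replaces the usual average loss, which is how the procedure trades variance against absolute training performance.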


Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions

Neural Information Processing Systems

Error bound conditions (EBC) are properties that characterize the growth of an objective function when a point is moved away from the optimal set. They have recently received increasing attention in the field of optimization for developing optimization algorithms with fast convergence. However, the studies of EBC in statistical learning are hitherto still limited. The main contributions of this paper are two-fold. First, we develop fast and intermediate rates of empirical risk minimization (ERM) under EBC for risk minimization with Lipschitz continuous, and smooth convex random functions. Second, we establish fast and intermediate rates of an efficient stochastic approximation (SA) algorithm for risk minimization with Lipschitz continuous random functions, which requires only one pass of $n$ samples and adapts to EBC. For both approaches, the convergence rates span a full spectrum between $\widetilde O(1/\sqrt{n})$ and $\widetilde O(1/n)$ depending on the power constant in EBC, and could be even faster than $O(1/n)$ in special cases for ERM. Moreover, these convergence rates are automatically adaptive without using any knowledge of EBC. Overall, this work not only strengthens the understanding of ERM for statistical learning but also brings new fast stochastic algorithms for solving a broad range of statistical learning problems.
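
For concreteness, one standard way to write such an error bound condition and the resulting rate interpolation is given below; the exponent convention and constants are an illustrative assumption rather than a quotation from the paper:

\[
\operatorname{dist}(w, \mathcal{W}_*)^2 \;\le\; c\,\bigl(F(w) - F_*\bigr)^{\theta}, \qquad \theta \in (0, 1],
\]

where $F_* = \min_w F(w)$ and $\mathcal{W}_*$ is the optimal set. Excess-risk bounds of the form $\widetilde O\bigl(n^{-1/(2-\theta)}\bigr)$ then interpolate between $\widetilde O(1/\sqrt{n})$ as $\theta \to 0$ and $\widetilde O(1/n)$ at $\theta = 1$, matching the spectrum described above.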



LOG: Active Model Adaptation for Label-Efficient OOD Generalization

Neural Information Processing Systems

This work discusses how to achieve worst-case Out-Of-Distribution (OOD) generalization for a variety of distributions based on a relatively small labeling cost.





0ebcc77dc72360d0eb8e9504c78d38bd-Paper.pdf

Neural Information Processing Systems

As a consequence, empirical risk minimizers generally perform very poorly in extreme regions. It is the purpose of this paper to develop a general framework for classification in the extremes.
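
As an illustrative formalization (the notation here is assumed, not taken from the paper), the quantity at stake is the classification risk conditioned on the extreme region:

\[
R_\tau(g) \;=\; \mathbb{P}\bigl(g(X) \ne Y \;\big|\; \|X\| > \tau\bigr).
\]

Because only a small fraction of the training sample falls in the region $\{\|x\| > \tau\}$ when $\tau$ is large, the empirical counterpart of $R_\tau$ is computed from very few observations, which is why unmodified empirical risk minimizers degrade there.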



d800149d2f947ad4d64f34668f8b20f6-Paper.pdf

Neural Information Processing Systems

On the other hand, we derive necessary and sufficient conditions under which enforcing algorithmic fairness leads to the Bayes model in the target domain.